A Deep Reinforcement Learning Approach to Automated Stock Trading, using xLSTM Networks
Sarlakifar, Faezeh, Asl, Mohammadreza Mohammadzadeh, Khaledi, Sajjad Rezvani, Salimi-Badr, Armin
Traditional Long Short-Term Memory (LSTM) networks are effective for handling sequential data but suffer from limitations such as vanishing gradients and difficulty capturing long-term dependencies, which can impair their performance in dynamic, risky environments like stock trading. To address these limitations, this study explores the use of the newly introduced Extended Long Short-Term Memory (xLSTM) network in combination with a deep reinforcement learning (DRL) approach for automated stock trading. Our proposed method employs xLSTM networks in both the actor and critic components, enabling effective handling of time-series data and dynamic market environments. Proximal Policy Optimization (PPO), with its ability to balance exploration and exploitation, is used to optimize the trading strategy. Experiments were conducted using financial data from major tech companies over a comprehensive timeline, demonstrating that the xLSTM-based model outperforms LSTM-based methods on key trading evaluation metrics, including cumulative return, average profitability per trade, maximum earning rate, maximum pullback, and Sharpe ratio. These findings highlight the potential of xLSTM for enhancing DRL-based stock trading systems.
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
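The abstract names PPO as the optimizer but gives no implementation details. As a rough illustration only (not the authors' code), the clipped surrogate objective at the heart of PPO can be sketched for a single transition; the clipping threshold eps=0.2 is a conventional default, not a value taken from the paper:

```python
def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """PPO's clipped surrogate objective for one transition.

    ratio: pi_new(a|s) / pi_old(a|s), the policy probability ratio.
    advantage: estimated advantage A(s, a).
    Returns the (to-be-maximized) objective value; clipping removes the
    incentive to move the ratio outside [1 - eps, 1 + eps].
    """
    unclipped = ratio * advantage
    clipped = max(min(ratio, 1 + eps), 1 - eps) * advantage
    return min(unclipped, clipped)
```

In practice this is averaged over a minibatch and maximized by gradient ascent on the actor's parameters; the clipping is what lets PPO take multiple update epochs on the same batch without the policy collapsing.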
Towards Opinion Shaping: A Deep Reinforcement Learning Approach in Bot-User Interactions
Siahkali, Farbod, Samadi, Saba, Kebriaei, Hamed
This paper investigates the impact of interference in social network algorithms via user-bot interactions, focusing on the Stochastic Bounded Confidence Model (SBCM). Two approaches are explored: positioning agent-controlled bots in the network, and targeted advertising under various circumstances within an advertising budget. The study integrates the Deep Deterministic Policy Gradient (DDPG) algorithm and its variants to experiment with different deep reinforcement learning (DRL) strategies. Experimental results demonstrate that this approach can achieve efficient opinion shaping, indicating its potential for deploying advertising resources on social platforms.
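DDPG, named above, maintains slowly-tracking target networks updated by Polyak averaging. As a minimal sketch (parameters shown as plain lists of floats rather than network weights, and tau=0.005 is a common default, not a value from the paper):

```python
def soft_update(target, source, tau=0.005):
    """Polyak averaging of target-network parameters, as used in DDPG:
    theta_target <- tau * theta_source + (1 - tau) * theta_target.
    Keeping the target close to its old value stabilizes the bootstrapped
    critic targets during training."""
    return [tau * s + (1 - tau) * t for t, s in zip(target, source)]
```

In a real implementation this runs after every gradient step, once for the actor's target network and once for the critic's.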
A Deep Reinforcement Learning Approach to Battery Management in Dairy Farming via Proximal Policy Optimization
Ali, Nawazish, Shaw, Rachael, Mason, Karl
Dairy farms consume a significant amount of electricity in their operations, and this research focuses on enhancing energy efficiency and minimizing environmental impact in the sector by maximizing the utilization of renewable energy sources. We investigate the application of Proximal Policy Optimization (PPO), a deep reinforcement learning (DRL) algorithm, to dairy-farm battery management. We evaluate the algorithm's effectiveness by its ability to reduce reliance on the electricity grid, highlighting the potential of DRL to enhance energy management in dairy farming. Using real-world data, our results demonstrate that the PPO approach outperforms Q-learning by 1.62% in reducing electricity imported from the grid. This improvement highlights the potential of DRL for improving energy efficiency and sustainability in dairy farms.
- Food & Agriculture > Agriculture (1.00)
- Energy > Renewable > Solar (1.00)
- Energy > Power Industry (1.00)
- Energy > Energy Storage (1.00)
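The abstract does not describe the environment's dynamics, but the core state transition in any battery-management RL task is a dispatch step that trades off battery charge against grid import. A toy sketch, with purely illustrative capacity and rate numbers (not taken from the paper); the RL reward would typically penalize the returned grid import:

```python
def step_battery(load_kw, solar_kw, soc_kwh, capacity_kwh=20.0, max_rate_kw=5.0):
    """One hour of a toy farm battery model (illustrative numbers only).
    Charges from surplus solar, discharges to cover any deficit, and
    returns (new state of charge, electricity imported from the grid)."""
    net = load_kw - solar_kw              # positive: deficit, negative: surplus
    if net >= 0:                          # discharge to cover the deficit
        discharge = min(net, max_rate_kw, soc_kwh)
        return soc_kwh - discharge, net - discharge
    charge = min(-net, max_rate_kw, capacity_kwh - soc_kwh)
    return soc_kwh + charge, 0.0          # surplus hour: nothing imported
```

A learned policy would additionally choose *when* to charge or discharge (e.g. under time-of-use tariffs) instead of this greedy rule.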
Kinematics Modeling of Peroxy Free Radicals: A Deep Reinforcement Learning Approach
Nayak, Subhadarsi, Shalu, Hrithwik, Stember, Joseph
Tropospheric ozone, a concerning air pollutant, has been associated with health issues including asthma, bronchitis, and impaired lung function. The rates at which peroxy radicals react with NO play a critical role in the overall formation and depletion of tropospheric ozone. However, obtaining comprehensive kinetic data for these reactions remains challenging, and traditional approaches to determining rate constants are costly and technically intricate. Fortunately, the emergence of machine learning-based models offers a less resource- and time-intensive alternative for acquiring kinetics information. In this study, we leveraged deep reinforcement learning to predict ranges of rate constants (k) with exceptional accuracy, achieving a testing-set accuracy of 100%. To analyze reactivity trends based on the molecular structure of peroxy radicals, we employed 51 global descriptors as input parameters, derived from optimized minimum-energy geometries of peroxy radicals using the quantum composite G3B3 method. Through the application of Integrated Gradients (IGs), we gained valuable insights into the significance of the various descriptors in relation to reaction rates. We validated and contextualized our findings by conducting cross-comparisons with established trends in the existing literature. These results establish a solid foundation for advances in chemistry in which computational analysis serves as a driver of innovation.
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (0.46)
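The paper applies Integrated Gradients to 51 molecular descriptors; as a one-feature toy illustration of the attribution method itself (not the authors' model, and assuming an analytic gradient is available), IG integrates the gradient along a straight path from a baseline input to the actual input:

```python
def integrated_gradients(grad_f, x, baseline=0.0, steps=100):
    """Integrated Gradients attribution for a single input feature:
    (x - baseline) times the average of grad_f along the straight path
    from baseline to x, approximated by a midpoint Riemann sum. For a
    well-behaved scalar f of one feature, the attribution equals
    f(x) - f(baseline)."""
    avg = 0.0
    for k in range(steps):
        a = (k + 0.5) / steps  # midpoint of the k-th path segment
        avg += grad_f(baseline + a * (x - baseline))
    return (x - baseline) * avg / steps
```

With many features, the same path integral is taken per feature and the per-feature attributions sum to the change in model output, which is what makes IG useful for ranking descriptor importance.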
Improving Environment Robustness of Deep Reinforcement Learning Approaches for Autonomous Racing Using Bayesian Optimization-based Curriculum Learning
Banerjee, Rohan, Ray, Prishita, Campbell, Mark
Deep reinforcement learning (RL) approaches have been broadly applied to a large number of robotics tasks, such as robot manipulation and autonomous driving. However, an open problem in deep RL is learning policies that are robust to variations in the environment, an important condition for deploying such systems into real-world, unstructured settings. Curriculum learning is one approach that has been applied to improve generalization performance in both supervised and reinforcement learning domains, but selecting the appropriate curriculum to achieve robustness can be a user-intensive process. In our work, we show that performing probabilistic inference of the underlying curriculum-reward function using Bayesian optimization is a promising technique for finding a robust curriculum. We demonstrate that a curriculum found with Bayesian optimization can outperform a vanilla deep RL agent and a hand-engineered curriculum in the domain of autonomous racing with obstacle avoidance. Our code is available at https://github.com/PRISHIta123/Curriculum_RL_for_Driving.
- Research Report > Promising Solution (0.34)
- Transportation (0.49)
- Information Technology (0.35)
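The abstract does not specify how the Bayesian optimization is implemented; a real system would use a GP library, but the core idea (fit a Gaussian-process surrogate to observed curriculum-parameter/reward pairs, then evaluate the candidate with the highest upper confidence bound) can be sketched in pure Python. The RBF length-scale, UCB beta, and one-dimensional curriculum parameter are all illustrative assumptions:

```python
import math

def rbf(a, b, ls=0.3):
    """RBF (squared-exponential) kernel over a scalar curriculum parameter."""
    return math.exp(-((a - b) ** 2) / (2 * ls * ls))

def solve(A, b):
    """Gaussian elimination with partial pivoting (n is tiny here)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_ucb_next(xs, ys, candidates, beta=2.0, noise=1e-6):
    """Pick the next curriculum parameter to evaluate: GP posterior over
    observed (parameter, reward) pairs, maximizing mean + beta * std."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(K, ys)
    best, best_ucb = None, -float("inf")
    for c in candidates:
        kv = [rbf(c, x) for x in xs]
        mean = sum(kv[i] * alpha[i] for i in range(n))
        v = solve(K, kv)                       # K^{-1} k(c)
        var = max(rbf(c, c) - sum(kv[i] * v[i] for i in range(n)), 0.0)
        if mean + beta * math.sqrt(var) > best_ucb:
            best, best_ucb = c, mean + beta * math.sqrt(var)
    return best
```

The UCB acquisition naturally prefers unexplored regions (high variance) early and high-reward regions later, which is what makes BO sample-efficient when each curriculum evaluation requires a full RL training run.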
A Deep Reinforcement Learning Approach for Interactive Search with Sentence-level Feedback
Zhou, Jianghong, Ho, Joyce C., Lin, Chen, Agichtein, Eugene
Interactive search can provide a better experience by incorporating interaction feedback from users, which significantly improves search accuracy by avoiding irrelevant information and capturing users' search intents. Existing state-of-the-art (SOTA) systems use reinforcement learning (RL) models to incorporate these interactions but focus on item-level feedback, ignoring the fine-grained information found in sentence-level feedback. Leveraging such feedback, however, requires extensive exploration of the RL action space and large amounts of annotated data. This work addresses these challenges by proposing a new deep Q-learning (DQ) approach, DQrank. DQrank adapts BERT-based models, the SOTA in natural language processing, to select crucial sentences based on users' engagement and to rank items for more satisfactory responses. We also propose two mechanisms to better explore optimal actions. DQrank further utilizes the experience replay mechanism in DQ, storing feedback sentences to obtain a better initial ranking performance. We validate the effectiveness of DQrank on three search datasets. The results show that DQrank performs at least 12% better than the previous SOTA RL approaches. We also conduct detailed ablation studies, which demonstrate that each model component can efficiently extract and accumulate long-term engagement effects from users' sentence-level feedback. This structure offers a promising way to construct search systems with sentence-level interaction.
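Experience replay, which the abstract credits for DQrank's initial ranking performance, is a standard deep Q-learning component. A generic sketch (not DQrank's actual buffer, which stores feedback sentences rather than plain transitions):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size experience replay buffer as used in deep Q-learning:
    old transitions are evicted FIFO via deque's maxlen, and training
    batches are sampled uniformly at random, which breaks the temporal
    correlations of consecutive interactions."""

    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

Reusing each stored interaction across many updates is what lets feedback-hungry approaches make the most of limited annotated data.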
Benchmarking Robustness of Deep Reinforcement Learning approaches to Online Portfolio Management
Velay, Marc, Doan, Bich-Liên, Rimmel, Arpad, Popineau, Fabrice, Daniel, Fabrice
Deep reinforcement learning (DRL) approaches to online portfolio selection have grown in popularity in recent years. The sensitive nature of training RL agents implies a need for extensive effort in market representation, behavior objectives, and training processes, which has often been lacking in previous works. We propose a training and evaluation process to assess the performance of classical DRL algorithms for portfolio management. We found that most DRL algorithms were not robust, with strategies generalizing poorly and degrading quickly during backtesting.
- Information Technology (1.00)
- Banking & Finance > Trading (0.95)
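The degradation described above is typically diagnosed by backtesting on held-out data. As a minimal, generic sketch (not the authors' evaluation pipeline), the two most common summary statistics can be computed from an equity curve:

```python
def backtest_stats(equity):
    """Cumulative return and maximum drawdown of an equity curve,
    given as a list of portfolio values over time. Maximum drawdown is
    the largest fractional drop from any running peak."""
    peak, max_dd = equity[0], 0.0
    for v in equity:
        peak = max(peak, v)
        max_dd = max(max_dd, (peak - v) / peak)
    return equity[-1] / equity[0] - 1.0, max_dd
```

A strategy that looks strong in-sample but shows deep drawdowns out-of-sample is exactly the non-robust behavior the paper reports.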
Autonomous Agent for Beyond Visual Range Air Combat: A Deep Reinforcement Learning Approach
Dantas, Joao P. A., Maximo, Marcos R. O. A., Yoneyama, Takashi
This work contributes to developing an agent based on deep reinforcement learning capable of acting in a beyond-visual-range (BVR) air combat simulation environment. The paper presents an overview of building an agent representing a high-performance fighter aircraft that can learn and improve its role in BVR combat over time, based on rewards calculated from operational metrics. Also, through self-play experiments, we expect to generate new air combat tactics never seen before. Finally, we intend to examine, using virtual simulation, a real pilot's ability to interact in the same environment with the trained agent, and to compare their performances. This research will contribute to the air combat training context by developing agents that can interact with real pilots to improve their performance in air defense missions.
Practical Deep Reinforcement Learning Approach for Stock Trading
Xiong, Zhuoran, Liu, Xiao-Yang, Zhong, Shan, Yang, Hongyang, Walid, Anwar
Stock trading strategy plays a crucial role in investment companies; however, it is challenging to obtain an optimal strategy in the complex and dynamic stock market. We explore the potential of deep reinforcement learning to optimize a stock trading strategy and thus maximize investment return. Thirty stocks are selected as our trading stocks, and their daily prices are used as the training and trading market environment. We train a deep reinforcement learning agent and obtain an adaptive trading strategy. The agent's performance is evaluated and compared with the Dow Jones Industrial Average and the traditional min-variance portfolio allocation strategy. The proposed deep reinforcement learning approach is shown to outperform both baselines in terms of the Sharpe ratio and cumulative returns.
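The abstract compares strategies by Sharpe ratio and cumulative return. As a generic illustration (the paper's exact conventions, e.g. risk-free rate and annualization factor, are not stated here; 252 trading days and a zero risk-free rate are common assumptions), an annualized Sharpe ratio can be computed from per-period returns:

```python
import math

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from a series of per-period returns:
    mean excess return over its sample standard deviation, scaled by
    sqrt(periods_per_year)."""
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((r - mean) ** 2 for r in excess) / (len(excess) - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)
```

Because the Sharpe ratio normalizes return by volatility, it rewards the risk-adjusted performance that a raw cumulative-return comparison would miss.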